We start with the import statements for all the modules and the Python files we have created for this project.
We define a variable for the size of the images that will be fed to the model (after preprocessing).
We define a variable for the total number of classes (traffic signs) that we have to classify.
from keras.utils import np_utils
import numpy as np
from random import randint
from keras.callbacks import ModelCheckpoint
import glob
from IPython.core.display import display, Image
from matplotlib import pyplot
from sklearn.utils import class_weight
from keras.callbacks import LearningRateScheduler
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator
from keras import backend as K
K.set_image_dim_ordering('th')
from helpers.preprocessImage import *
from helpers.imageDetails import randomImageDetails
from helpers.classDetails import datasetDetails
from helpers.classLabels import loadClassLabelsMapAndList
from helpers.simpleCNN import get_simple_cnn_model
from helpers.lenetModel import get_lenet_model
from helpers.modifiedLenet import get_modified_lenet_model
from helpers.modelPerformance import show_models_performance
from helpers.testImages import detect_image_type_lenet_model
from helpers.stratify import get_stratified_dataset
#Incoming image size after preprocessing - we resize the image to be of size 32x32
IMAGE_SIZE = 32
#Total class labels for the dataset
TOTAL_CLASSES = 43
Read and preprocess all the images from the folder ./GTSRB/Final_Training/Images.
These images form the training dataset. The readPreprocessedTrainTrafficSigns method reads the images from the folders and their corresponding labels from the CSV files, preprocesses the images, and stores them in an array.
Lastly, the method returns the preprocessed train images array and an array of corresponding labels for each image.
We then convert the images array to a NumPy array and one-hot encode the labels array, storing the result in labels.
images_arr, labels_arr = readPreprocessedTrainTrafficSigns('./GTSRB/Final_Training/Images', IMAGE_SIZE)
images = np.array(images_arr, dtype='float32')
labels = np_utils.to_categorical(np.array(labels_arr), TOTAL_CLASSES)
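As a quick illustration of what the one-hot encoding step does, here is a minimal sketch using plain NumPy rather than np_utils, on a toy 4-class example (the helper name `one_hot` is illustrative, not part of the project):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot row vectors (illustrative sketch)."""
    encoded = np.zeros((len(labels), num_classes), dtype='float32')
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

toy_labels = np.array([0, 2, 1])
# Each row has a single 1 at the index of its class label.
print(one_hot(toy_labels, 4))
```

np_utils.to_categorical produces the same shape: one row per image, one column per class.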
Read and preprocess all the images from the folder ./GTSRB/Final_Test/Images.
These images form the testing dataset. The readPreprocessedTestTrafficSigns method reads the images from the folders and their corresponding labels from the CSV files, preprocesses the images, and stores them in an array.
Lastly, the method returns the preprocessed test images array and an array of corresponding labels for each image.
We then convert the test images array and test labels array to NumPy arrays.
test_images_arr, test_labels_arr = readPreprocessedTestTrafficSigns('./GTSRB/Final_Test/Images', IMAGE_SIZE)
test_images = np.array(test_images_arr, dtype='float32')
test_labels = np.array(test_labels_arr)
TRAIN DATASET
Let's print the details of an image from the training dataset. We have a method called randomImageDetails, which takes a random index within the length of the images array and prints out details of the image at that index.
For example, if random_number = 5, we print out the details of the image at index 5 of the train images array and labels array.
"""
Picking a random TRAIN image and printing its details -
1. Printing the image array
2. Printing the image shape
3. Printing the total rows and columns in the image
4. Printing the label for the image
We also plot the processed image
"""
random_number = randint(0, len(images) - 1)
randomImageDetails(random_number, images, labels_arr)
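The actual helper lives in helpers/imageDetails.py; a minimal sketch of what such a function might print is shown below. The toy data and the function name `image_details_sketch` are illustrative only, not the project's implementation (note the channel-first shape, matching the 'th' image ordering set earlier):

```python
import numpy as np

def image_details_sketch(index, images, labels):
    """Print basic details of the image at the given index (illustrative only)."""
    img = images[index]
    rows, cols = img.shape[-2], img.shape[-1]
    print("Image shape:", img.shape)
    print("Rows: {}, Columns: {}".format(rows, cols))
    print("Label:", labels[index])
    return img.shape, labels[index]

# Toy data standing in for the preprocessed dataset: 3 single-channel 32x32 images
toy_images = np.zeros((3, 1, 32, 32), dtype='float32')
toy_labels = [7, 14, 3]
image_details_sketch(1, toy_images, toy_labels)
```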
TEST DATASET
Let's print the details of an image from the testing dataset. This is similar to the code cell above, except it uses the testing dataset.
For example, if random_number = 5, we print out the details of the image at index 5 of the test images array and labels array.
"""
Picking a random TEST image and printing its details -
1. Printing the image array
2. Printing the image shape
3. Printing the total rows and columns in the image
4. Printing the label for the image
We also plot the processed image
"""
random_number = randint(0, len(test_images) - 1)
randomImageDetails(random_number, test_images, test_labels_arr)
Let's analyse the dataset. We will first map a meaningful traffic sign name to each class ID.
We use the following link to get meaningful names for each kind of traffic sign image - http://www.gettingaroundgermany.info/zeichen.shtml
After this mapping, we display a graph of the frequency of images for each class ID, i.e. how many images we have for each class label.
Since the number of images per class is not the same, we have an imbalanced dataset; the graph makes this imbalance visible.
We also print out the frequency of images for each class, along with the classes with the highest and lowest frequencies.
#Loading the mapping between class IDs and labels
class_labels_list = loadClassLabelsMapAndList('./GTSRB')
#Visualizing the class labels frequency.
#Also, prints the frequency for each class ID.
datasetDetails(labels_arr, TOTAL_CLASSES, class_labels_list)
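The frequency counting behind datasetDetails can be sketched with np.bincount; the toy labels below are illustrative, not the GTSRB data:

```python
import numpy as np

# Toy label array standing in for labels_arr, with 3 classes
toy_labels = np.array([0, 1, 1, 2, 2, 2, 2, 0, 1, 2])

# Count how many images fall into each class ID
freq = np.bincount(toy_labels, minlength=3)
print("Images per class:", freq)
print("Most frequent class:", np.argmax(freq), "with", freq.max(), "images")
print("Least frequent class:", np.argmin(freq), "with", freq.min(), "images")
```

The unequal counts are exactly what the frequency graph visualizes for the 43 real classes.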
Let's first try training a simple CNN model on our preprocessed images.
We split the dataset into a train set and a validation set via the validation_split argument when fitting: 80% of the images are used for training and 20% for validation. We will not be using data augmentation for this model.
We increased the number of filters with every convolutional layer, which in turn increases the depth of each layer's output.
We alternated every convolutional layer with a max pooling layer. Pooling layers reduce the spatial size of the representation, which reduces the number of parameters and the amount of computation.
We also added a global average pooling layer, which averages each feature map down to a single value, collapsing the 3D output into a vector with one summary value per channel.
The last layer is the output layer (a dense layer) with 43 nodes, one for each type of traffic sign.
We will also print out the model's performance using our method called show_models_performance.
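To see how a max pooling layer shrinks the spatial size, here is a 2x2 pooling step done by hand with NumPy on a toy 4x4 feature map (illustrative values; the real layers operate on learned feature maps):

```python
import numpy as np

feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 7, 8],
                        [9, 2, 1, 0],
                        [3, 4, 5, 6]], dtype='float32')

# 2x2 max pooling with stride 2: keep the max of each 2x2 block,
# halving each spatial dimension (4x4 -> 2x2).
h, w = feature_map.shape
pooled = feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
print(pooled)
```

A quarter of the values remain, yet the strongest activation in each neighbourhood is preserved, which is why parameters and computation drop in the layers that follow.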
#Simple CNN model without data augmentation and without stratified data split
simple_model = get_simple_cnn_model(IMAGE_SIZE)
simple_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
checkpointer = ModelCheckpoint(filepath='best_simple_cnn.hdf5', verbose=1, save_best_only=True)
simple_model_history = simple_model.fit(images, labels, validation_split=0.2, shuffle=True, epochs=15, batch_size=32, callbacks=[checkpointer], verbose=1)
simple_cnn_y_pred = simple_model.predict_classes(test_images)
show_models_performance(test_labels, simple_cnn_y_pred, TOTAL_CLASSES)
We can see the performance of our model on the validation set. We train for only 15 epochs, since the validation loss stops improving, at which point we stop training the model.
We have printed the model summary.
Since our dataset is imbalanced, accuracy is not the best metric for judging how our model performs. Instead we look at precision, recall and F-beta score, using the weighted-average version of precision_recall_fscore_support.
We print out the classification report for the model.
The confusion matrix is computed with the confusion_matrix method of sklearn.metrics and plotted using the matplotlib and seaborn modules.
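A hand-rolled sketch of what the confusion matrix and a weighted metric compute, on toy predictions (plain NumPy here; the notebook itself uses sklearn's confusion_matrix and precision_recall_fscore_support):

```python
import numpy as np

y_true = np.array([0, 0, 1, 1, 1, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2])
n_classes = 3

# Confusion matrix: rows are true classes, columns are predicted classes.
cm = np.zeros((n_classes, n_classes), dtype=int)
for t, p in zip(y_true, y_pred):
    cm[t, p] += 1
print(cm)

# Weighted recall: per-class recall averaged with class support as weights,
# so minority classes are not drowned out by one overall accuracy number.
per_class_recall = cm.diagonal() / cm.sum(axis=1)
support = cm.sum(axis=1)
weighted_recall = np.sum(per_class_recall * support) / support.sum()
print("Weighted recall:", weighted_recall)
```

Weighted precision and F-beta follow the same pattern of per-class scores combined with support weights.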
PLOTTING LOSS VS EPOCHS
We have displayed a plot of loss on training and validation datasets over training epochs for simple CNN model.
#Plot Loss values for training and validation
pyplot.plot(simple_model_history.history['loss'])
pyplot.plot(simple_model_history.history['val_loss'])
pyplot.title('Model loss')
pyplot.ylabel('Loss')
pyplot.xlabel('Epochs')
pyplot.legend(['Train', 'Validation'], loc='upper right')
pyplot.show()
SIMPLE CNN MODEL PERFORMANCE ON IMAGES FROM THE WEB
Now, let's see how our simple CNN model performs on randomly selected images from the web.
Our detect_image_type_lenet_model method prints out the prediction for the input image.
img_list = sorted(glob.glob("./test_traffic_sign_images/*"))
for img_path in img_list:
    display_img = Image(img_path, width=175, height=175)
    display(display_img)
    detect_image_type_lenet_model(simple_model, img_path, class_labels_list, IMAGE_SIZE)
You can see that the simple CNN model does not perform well on the images picked from the web: it correctly classifies only 8 of the 16 images we provide, i.e. 50%.
The model clearly does not generalize well to these images, so we try to improve it with different techniques.
We will now try the Lenet-5 model and a modified Lenet-5 model.
COMMON FUNCTION CALLS FOR THE MODELS
The function calls below are common to both models that we will be training.
We have a method called get_stratified_dataset, which in turn calls sklearn's StratifiedShuffleSplit. Since our dataset is imbalanced, this gives us train and validation sets with the same class proportions.
In addition, we use class weights when fitting the models: when the model sees a training image from a class with fewer samples, that image contributes more to the loss. We compute the class weights using the compute_class_weight function of the sklearn.utils.class_weight module.
X_train, Y_train, X_val, Y_val = get_stratified_dataset(images, labels)
class_weights = class_weight.compute_class_weight('balanced', np.unique(labels_arr), labels_arr)
print("Class weights are: {}".format(class_weights))
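sklearn's 'balanced' class weights follow the formula n_samples / (n_classes * count_per_class); a quick NumPy check of that formula on toy labels (illustrative data, not the GTSRB labels):

```python
import numpy as np

toy_labels = np.array([0, 0, 0, 1, 1, 2])  # imbalanced toy dataset
counts = np.bincount(toy_labels)
n_samples, n_classes = len(toy_labels), len(counts)

# Rare classes get weights above 1 and frequent classes below 1,
# so each class contributes roughly equally to the total loss.
weights = n_samples / (n_classes * counts)
print(weights)
```

The rarest class (one sample) ends up with the largest weight, which is exactly how the minority traffic sign classes get extra attention during training.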
LENET-5 MODEL - COMPILE AND FIT THE MODEL
We are now trying out the Lenet-5 model on the dataset.
We will be using ImageDataGenerator to do data augmentation.
We print out the summary of the Lenet-5 model.
We will be training it for only 25 epochs, as we noticed that the validation loss does not improve after that; stopping there keeps the model from overfitting.
You can notice that we use the train and validation sets created in the code cell above using StratifiedShuffleSplit.
We will be using the SGD optimizer, as it tends to generalize better than adaptive optimizers, together with a learning rate that decreases as training progresses. This lets us take smaller and smaller steps as we approach the solution, so that we do not jump over it.
lr = 0.01
lenet_sgd_optimizer = SGD(lr=lr, decay=1e-6, momentum=0.9, nesterov=True)
def learning_rate_scheduler(epoch):
    return lr*(0.1**int(epoch/10))
datagen_lenet = ImageDataGenerator(rotation_range=17,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   shear_range=0.3,
                                   zoom_range=0.15,
                                   horizontal_flip=False,
                                   fill_mode='nearest')
datagen_lenet.fit(X_train)
datagen_lenet_model = get_lenet_model(IMAGE_SIZE, TOTAL_CLASSES)
datagen_lenet_model.summary()
datagen_lenet_model.compile(optimizer=lenet_sgd_optimizer,
                            loss='categorical_crossentropy',
                            metrics=['accuracy'])
datagen_lenet_checkpointer = ModelCheckpoint(filepath='best_lenet_datagen.hdf5',
                                             verbose=1,
                                             save_best_only=True)
datagen_lenet_model_history = datagen_lenet_model.fit_generator(datagen_lenet.flow(X_train, Y_train, batch_size=32),
                                                                class_weight=class_weights,
                                                                steps_per_epoch=int(X_train.shape[0] / 8),
                                                                epochs=25,
                                                                validation_data=(X_val, Y_val),
                                                                callbacks=[LearningRateScheduler(learning_rate_scheduler), datagen_lenet_checkpointer])
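The scheduler above steps the learning rate down by a factor of 10 after every block of 10 epochs. A standalone copy of the same formula makes the schedule easy to inspect:

```python
lr = 0.01

def learning_rate_scheduler(epoch):
    # Divide the learning rate by 10 after every block of 10 epochs.
    return lr * (0.1 ** int(epoch / 10))

for epoch in (0, 9, 10, 19, 20):
    print("epoch {:2d} -> lr {}".format(epoch, learning_rate_scheduler(epoch)))
```

Epochs 0-9 train at 0.01, epochs 10-19 at 0.001, and so on, which is the "smaller steps near the solution" behaviour described above.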
PLOTTING LOSS VS EPOCHS
We have displayed a plot of loss on training and validation datasets over training epochs for the Lenet-5 model.
#Plot Loss values for training and validation
pyplot.plot(datagen_lenet_model_history.history['loss'])
pyplot.plot(datagen_lenet_model_history.history['val_loss'])
pyplot.title('Model loss')
pyplot.ylabel('Loss')
pyplot.xlabel('Epochs')
pyplot.legend(['Train', 'Validation'], loc='upper right')
pyplot.show()
LENET-5 MODEL - TESTING THE MODEL
We will now be testing the Lenet-5 model on the test dataset.
We print the performance of the model using the show_models_performance function that we implemented.
Since our dataset is imbalanced, accuracy is not the best metric for judging how our model performs. Instead we look at precision, recall and F-beta score, using the weighted-average version of precision_recall_fscore_support.
We print out the classification report for the model.
Confusion matrix is calculated using the confusion_matrix method of sklearn.metrics. We use matplotlib and seaborn modules to plot the confusion matrix.
datagen_lenet_y_pred = datagen_lenet_model.predict_classes(test_images)
datagen_lenet_test_accuracy = np.sum(datagen_lenet_y_pred == test_labels)/np.size(datagen_lenet_y_pred)
print("Test accuracy: {:.4f}".format(datagen_lenet_test_accuracy))
show_models_performance(test_labels, datagen_lenet_y_pred, TOTAL_CLASSES)
LENET-5 MODEL PERFORMANCE FOR IMAGES FROM THE WEB
Now, let's see how our Lenet-5 model performs on randomly selected images from the web.
Our detect_image_type_lenet_model prints out the prediction for the input image.
img_list = sorted(glob.glob("./test_traffic_sign_images/*"))
for img_path in img_list:
    display_img = Image(img_path, width=175, height=175)
    display(display_img)
    detect_image_type_lenet_model(datagen_lenet_model, img_path, class_labels_list, IMAGE_SIZE)
You can see that the Lenet-5 model performed better on the images picked from the web: it correctly classifies 14 of the 16 images we provide, i.e. 87.50%.
LENET-5 MODIFIED MODEL - COMPILE AND FIT THE MODEL
We are now trying out the modified Lenet-5 model on the dataset.
Each filter in a CNN produces a feature map, and increasing the number of filters helps capture more detailed features of the dataset. We modified the model by increasing the number of filters in the first two convolutional layers:
1st convolutional layer - number of filters increased from 6 to 12
2nd convolutional layer - number of filters increased from 16 to 32
We also added a dropout layer after the flatten layer. Dropout is a regularization technique: it helps avoid overfitting by ignoring randomly selected neurons while the model is being trained, which yields a network that generalizes better.
We will be using ImageDataGenerator to do data augmentation.
We print out the summary of the modified Lenet-5 model.
We will be training it for only 30 epochs, as we noticed that the validation loss does not improve after that; stopping there keeps the model from overfitting.
You can notice that we use the train and validation sets created in the code cell above using StratifiedShuffleSplit.
We will be using the SGD optimizer, as it tends to generalize better than adaptive optimizers, together with a learning rate that decreases as training progresses. This lets us take smaller and smaller steps as we approach the solution, so that we do not jump over it.
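The dropout idea can be sketched as a random mask over activations. This is inverted dropout, the scheme Keras applies at training time; the toy values and the 0.5 rate are illustrative, not the model's actual settings:

```python
import numpy as np

rng = np.random.RandomState(0)
activations = np.ones(10, dtype='float32')
rate = 0.5  # fraction of units to drop

# Inverted dropout: zero out a random subset of units and scale the
# survivors by 1/(1-rate) so the expected activation stays the same.
mask = rng.binomial(1, 1.0 - rate, size=activations.shape)
dropped = activations * mask / (1.0 - rate)
print(dropped)
```

Because a different subset of neurons is silenced on every batch, no single neuron can be relied on too heavily, which is what pushes the network toward better generalization.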
lr = 0.01
lenet_sgd_optimizer = SGD(lr=lr, decay=1e-6, momentum=0.9, nesterov=True)
def learning_rate_scheduler(epoch):
    return lr*(0.1**int(epoch/10))
datagen_modified_lenet = ImageDataGenerator(rotation_range=17,
                                            width_shift_range=0.1,
                                            height_shift_range=0.1,
                                            shear_range=0.3,
                                            zoom_range=0.15,
                                            horizontal_flip=False,
                                            fill_mode='nearest')
datagen_modified_lenet.fit(X_train)
datagen_modified_lenet_model = get_modified_lenet_model(IMAGE_SIZE, TOTAL_CLASSES)
datagen_modified_lenet_model.summary()
datagen_modified_lenet_model.compile(optimizer=lenet_sgd_optimizer,
                                     loss='categorical_crossentropy',
                                     metrics=['accuracy'])
datagen_modified_lenet_checkpointer = ModelCheckpoint(filepath='best_lenet_modified_datagen.hdf5',
                                                      verbose=1,
                                                      save_best_only=True)
datagen_modified_lenet_model_history = datagen_modified_lenet_model.fit_generator(datagen_modified_lenet.flow(X_train, Y_train, batch_size=32),
                                                                                  class_weight=class_weights,
                                                                                  steps_per_epoch=int(X_train.shape[0] / 8),
                                                                                  epochs=30,
                                                                                  validation_data=(X_val, Y_val),
                                                                                  callbacks=[LearningRateScheduler(learning_rate_scheduler), datagen_modified_lenet_checkpointer])
PLOTTING LOSS VS EPOCHS
We have displayed a plot of loss on training and validation datasets over training epochs for the Lenet-5 modified model.
#Plot Loss values for training and validation
pyplot.plot(datagen_modified_lenet_model_history.history['loss'])
pyplot.plot(datagen_modified_lenet_model_history.history['val_loss'])
pyplot.title('Model loss')
pyplot.ylabel('Loss')
pyplot.xlabel('Epochs')
pyplot.legend(['Train', 'Validation'], loc='upper right')
pyplot.show()
LENET-5 MODIFIED MODEL - TESTING THE MODEL
We will now be testing the Lenet-5 modified model on the test dataset.
We print the performance of the model using the show_models_performance function that we implemented.
Since our dataset is imbalanced, accuracy is not the best metric for judging how our model performs. Instead we look at precision, recall and F-beta score, using the weighted-average version of precision_recall_fscore_support.
We print out the classification report for the model.
Confusion matrix is calculated using the confusion_matrix method of sklearn.metrics. We use matplotlib and seaborn modules to plot the confusion matrix.
datagen_modified_lenet_y_pred = datagen_modified_lenet_model.predict_classes(test_images)
datagen_modified_lenet_test_accuracy = np.sum(datagen_modified_lenet_y_pred == test_labels)/np.size(datagen_modified_lenet_y_pred)
print("Test accuracy: {:.4f}".format(datagen_modified_lenet_test_accuracy))
show_models_performance(test_labels, datagen_modified_lenet_y_pred, TOTAL_CLASSES)
LENET-5 MODIFIED MODEL PERFORMANCE FOR IMAGES FROM THE WEB
Now, let's see how our Lenet-5 modified model performs on randomly selected images from the web.
Our detect_image_type_lenet_model prints out the prediction for the input image.
img_list = sorted(glob.glob("./test_traffic_sign_images/*"))
for img_path in img_list:
    display_img = Image(img_path, width=175, height=175)
    display(display_img)
    detect_image_type_lenet_model(datagen_modified_lenet_model, img_path, class_labels_list, IMAGE_SIZE)
You can see that the modified Lenet-5 model performed better than the Lenet-5 model on the images picked from the web: it correctly classifies 15 of the 16 images we provide, i.e. 93.75%.